387 research outputs found

    GenomeGraphs: integrated genomic data visualization with R.

    Background: Biological studies involve a growing number of distinct high-throughput experiments to characterize samples of interest. There is a lack of methods to visualize these different genomic datasets in a versatile manner. In addition, genomic data analysis requires integrated visualization of experimental data along with constantly changing genomic annotation and statistical analyses. Results: We developed GenomeGraphs, an add-on software package for the statistical programming environment R, to facilitate integrated visualization of genomic datasets. GenomeGraphs uses the biomaRt package to perform on-line annotation queries to Ensembl and translates these to gene/transcript structures in viewports of the grid graphics package. This allows genomic annotation to be plotted together with experimental data. GenomeGraphs can also be used to plot custom annotation tracks in combination with different experimental data types together in one plot using the same genomic coordinate system. Conclusion: GenomeGraphs is a flexible and extensible software package which can be used to visualize a multitude of genomic datasets within the statistical programming environment R.

    Review of Sex, Murder, and the Unwritten Law: Courting Judicial Mayhem, Texas Style. By Bill Neal.

    If, as has often been contended, truth is the first casualty of traditional warfare, then logic, it appears, is the first casualty of sexual warfare. And with that thematic statement in hand, author Bill Neal is off to the proverbial races with an often delightful, sometimes troubling, and generally entertaining legal discourse on the so-called "unwritten law": that a cuckolded husband or a woman wronged has the God-given right to avenge or be avenged, even to redress by murder. With a curiously dispassionate, or at least overly serious, foreword by Cal State-Fullerton professor Gordon Morris Bakken, Neal's tales of adultery, murder, and boundlessly ridiculous not guilty verdicts cross several decades from the 1880s over a North Texas path of tornadic sex-and-revenge events. Looking back through centuries of legal cases and precedents from which this unwritten law evolved, Neal considers six cases that more or less represent that evolution as it stood along the Red River at the turn into the 20th century and beyond. In each, the tragic event itself is played out against the trial that followed, replete with the actual testimony and strategies that eventually produced a stunning but somehow not surprising verdict exonerating the victim of the adulterous conduct from the charge of murder. In the epilogues for each tale, the author follows the primary figures into their respective futures, offering an occasional perspective on the impact of the judgment rendered.

    Integrating biological knowledge into variable selection: an empirical Bayes approach with an application in cancer biology

    Background: An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biological information concerning the variables of interest. Pathway and network maps are one example of a source of such information. However, although ancillary information is increasingly available, it is not always clear how it should be used nor how it should be weighted in relation to primary data. Results: We put forward an approach in which biological knowledge is incorporated using informative prior distributions over variable subsets, with prior information selected and weighted in an automated, objective manner using an empirical Bayes formulation. We employ continuous, linear models with interaction terms and exploit biochemically-motivated sparsity constraints to permit exact inference. We show an example of priors for pathway- and network-based information and illustrate our proposed method on both synthetic response data and by an application to cancer drug response data. Comparisons are also made to alternative Bayesian and frequentist penalised-likelihood methods for incorporating network-based information. Conclusions: The empirical Bayes method proposed here can aid prior elicitation for Bayesian variable selection studies and help to guard against mis-specification of priors. Empirical Bayes, together with the proposed pathway-based priors, results in an approach with a competitive variable selection performance. In addition, the overall procedure is fast, deterministic, and has very few user-set parameters, yet is capable of capturing interplay between molecular players. The approach presented is general and readily applicable in any setting with multiple sources of biological prior knowledge

    Genomic aberrations in normal tissue adjacent to HER2-amplified breast cancers: field cancerization or contaminating tumor cells?

    Field cancerization effects as well as isolated tumor cell foci extending well beyond the invasive tumor margin have been described previously to account for local recurrence rates following breast conserving surgery despite adequate surgical margins and breast radiotherapy. To look for evidence of possible tumor cell contamination or field cancerization by genetic effects, a pilot study (Study 1: 12 sample pairs) followed by a verification study (Study 2: 20 sample pairs) were performed on DNA extracted from HER2-positive breast tumors and matching normal adjacent mammary tissue samples excised 1-3 cm beyond the invasive tumor margin. High-resolution molecular inversion probe (MIP) arrays were used to compare genomic copy number variations, including increased HER2 gene copies, between the paired samples; as well, a detailed histologic and immunohistochemical (IHC) re-evaluation of all Study 2 samples was performed blinded to the genomic results to characterize the adjacent normal tissue composition bracketing the DNA-extracted samples. Overall, 14/32 (44%) sample pairs from both studies produced genome-wide evidence of genetic aberrations including HER2 copy number gains within the adjacent normal tissue samples. The observed single-parental origin of monoallelic HER2 amplicon haplotypes shared by informative tumor-normal pairs, as well as commonly gained loci elsewhere on 17q, suggested the presence of contaminating tumor cells in the genomically aberrant normal samples. Histologic and IHC analyses identified occult 25-200 μm tumor cell clusters overexpressing HER2 scattered in more than half, but not all, of the genomically aberrant normal samples re-evaluated, but in none of the genomically normal samples. These genomic and microscopic findings support the conclusion that tumor cell contamination rather than genetic field cancerization represents the likeliest cause of local clinical recurrence rates following breast conserving surgery, and mandate caution in assuming the genomic normalcy of histologically benign appearing peritumor breast tissue.

    The XBabelPhish MAGE-ML and XML Translator

    Background: MAGE-ML has been promoted as a standard format for describing microarray experiments and the data they produce. Two characteristics of the MAGE-ML format compromise its use as a universal standard. First, MAGE-ML files are exceptionally large: too large to be easily read by most people, and often too large to be read by most software programs. Second, the MAGE-ML standard permits many ways of representing the same information. As a result, different producers of MAGE-ML create different documents describing the same experiment and its data. Recognizing all the variants is an unwieldy software engineering task, resulting in software packages that can read and process MAGE-ML from some, but not all producers. This Tower of MAGE-ML Babel bars the unencumbered exchange of microarray experiment descriptions couched in MAGE-ML. Results: We have developed XBabelPhish, an XQuery-based technology for translating one MAGE-ML variant into another. XBabelPhish's use is not restricted to translating MAGE-ML documents. It can transform XML files independent of their DTD, XML schema, or semantic content. Moreover, it is designed to work on very large (> 200 Mb) files, which are common in the world of MAGE-ML. Conclusion: XBabelPhish provides a way to inter-translate MAGE-ML variants for improved interchange of microarray experiment information. More generally, it can be used to transform most XML files, including very large ones that exceed the capacity of most XML tools.

    A robust prognostic signature for hormone-positive node-negative breast cancer

    BACKGROUND: Systemic chemotherapy in the adjuvant setting can cure breast cancer in some patients that would otherwise recur with incurable, metastatic disease. However, since only a fraction of patients would have recurrence after surgery alone, the challenge is to stratify high-risk patients (who stand to benefit from systemic chemotherapy) from low-risk patients (who can safely be spared treatment related toxicities and costs). METHODS: We focus here on risk stratification in node-negative, ER-positive, HER2-negative breast cancer. We use a large database of publicly available microarray datasets to build a random forests classifier and develop a robust multi-gene mRNA transcription-based predictor of relapse free survival at 10 years, which we call the Random Forests Relapse Score (RFRS). Performance was assessed by internal cross-validation, multiple independent data sets, and comparison to existing algorithms using receiver-operating characteristic and Kaplan-Meier survival analysis. Internal redundancy of features was determined using k-means clustering to define optimal signatures with smaller numbers of primary genes, each with multiple alternates. RESULTS: Internal OOB cross-validation for the initial (full-gene-set) model on training data reported an ROC AUC of 0.704, which was comparable to or better than those reported previously or obtained by applying existing methods to our dataset. Three risk groups with probability cutoffs for low, intermediate, and high-risk were defined. Survival analysis determined a highly significant difference in relapse rate between these risk groups. Validation of the models against independent test datasets showed highly similar results. Smaller 17-gene and 8-gene optimized models were also developed with minimal reduction in performance. Furthermore, the signature was shown to be almost equally effective on both hormone-treated and untreated patients. CONCLUSIONS: RFRS allows flexibility in both the number and identity of genes utilized from thousands to as few as 17 or eight genes, each with multiple alternatives. The RFRS reports a probability score strongly correlated with risk of relapse. This score could therefore be used to assign systemic chemotherapy specifically to those high-risk patients most likely to benefit from further treatment.

    The Cell Cycle–Regulated Genes of Schizosaccharomyces pombe

    Many genes are regulated as an innate part of the eukaryotic cell cycle, and a complex transcriptional network helps enable the cyclic behavior of dividing cells. This transcriptional network has been studied in Saccharomyces cerevisiae (budding yeast) and elsewhere. To provide more perspective on these regulatory mechanisms, we have used microarrays to measure gene expression through the cell cycle of Schizosaccharomyces pombe (fission yeast). The 750 genes with the most significant oscillations were identified and analyzed. There were two broad waves of cell cycle transcription, one in early/mid G2 phase, and the other near the G2/M transition. The early/mid G2 wave included many genes involved in ribosome biogenesis, possibly explaining the cell cycle oscillation in protein synthesis in S. pombe. The G2/M wave included at least three distinctly regulated clusters of genes: one large cluster including mitosis, mitotic exit, and cell separation functions, one small cluster dedicated to DNA replication, and another small cluster dedicated to cytokinesis and division. S. pombe cell cycle genes have relatively long, complex promoters containing groups of multiple DNA sequence motifs, often of two, three, or more different kinds. Many of the genes, transcription factors, and regulatory mechanisms are conserved between S. pombe and S. cerevisiae. Finally, we found preliminary evidence for a nearly genome-wide oscillation in gene expression: 2,000 or more genes undergo slight oscillations in expression as a function of the cell cycle, although whether this is adaptive, or incidental to other events in the cell, such as chromatin condensation, we do not know

    Dense geographic and genomic sampling reveals paraphyly and a cryptic lineage in a classic sibling species complex

    Incomplete or geographically biased sampling poses significant problems for research in phylogeography, population genetics, phylogenetics, and species delimitation. Despite the power of using genome-wide genetic markers in systematics and related fields, approaches such as the multispecies coalescent remain unable to easily account for unsampled lineages. The Empidonax difficilis/Empidonax occidentalis complex of small tyrannid flycatchers (Aves: Tyrannidae) is a classic example of widely distributed species with limited phenotypic geographic variation that was broken into two largely cryptic (or "sibling") lineages following extensive study. Though the group is well-characterized north of the US-Mexico border, the evolutionary distinctiveness and phylogenetic relationships of southern populations remain obscure. In this article, we use dense genomic and geographic sampling across the majority of the range of the E. difficilis/E. occidentalis complex to assess whether current taxonomy and species limits reflect underlying evolutionary patterns, or whether they are an artifact of historically biased or incomplete sampling. We find that additional samples from Mexico render the widely recognized species-level lineage E. occidentalis paraphyletic, though it retains support in the best-fit species delimitation model from clustering analyses. We further identify a highly divergent unrecognized lineage in a previously unsampled portion of the group's range, which a cline analysis suggests is more reproductively isolated than the currently recognized species E. difficilis and E. occidentalis. Our phylogeny supports a southern origin of these taxa. Our results highlight the pervasive impacts of biased geographic sampling, even in well-studied vertebrate groups like birds, and illustrate what is a common problem when attempting to define species in the face of recent divergence and reticulate evolution.